Update GRROsqueryCollector to use threadpoolexecutor #696

Merged 5 commits on Jan 4, 2023


@sydp (Collaborator) commented Dec 22, 2022

As suggested, this change uses ThreadPoolExecutor to run the osquery collection in multiple threads.
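A minimal sketch of the general pattern, not the actual GRROsqueryCollector code. The function `run_osquery_on_client`, the client IDs, and the `max_workers` value are hypothetical stand-ins chosen for illustration; only `concurrent.futures.ThreadPoolExecutor` itself comes from the PR description.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List


def run_osquery_on_client(client_id: str, query: str) -> Dict:
    """Hypothetical stand-in for running an osquery flow on one client."""
    # A real collector would launch the flow here and block until it finishes.
    return {"client_id": client_id, "query": query, "rows": []}


def collect(client_ids: List[str], query: str, max_workers: int = 4) -> List[Dict]:
    """Runs the query against every client using a pool of worker threads."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit one task per client and remember which client each future is for.
        futures = {
            executor.submit(run_osquery_on_client, client_id, query): client_id
            for client_id in client_ids
        }
        for future in as_completed(futures):
            # result() re-raises any exception raised in the worker thread.
            results.append(future.result())
    return results


results = collect(["C.0001", "C.0002", "C.0003"], "SELECT * FROM processes;")
print(sorted(r["client_id"] for r in results))
```

Because `as_completed` yields futures in completion order, the results are not guaranteed to match the order of `client_ids`; the example sorts them before printing.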

@sydp marked this pull request as ready for review December 22, 2022 11:26
@sydp self-assigned this Dec 22, 2022
@sydp requested review from ramo-j and tomchop December 22, 2022 11:27
@sydp requested a review from tomchop December 27, 2022 20:11
dftimewolf/lib/collectors/grr_hosts.py
Comment on lines +1019 to +1027
      results_container = containers.OsqueryResult(
          name=name,
          description=description,
          query=query,
          hostname=hostname,
          data_frame=pd.DataFrame(),
          flow_identifier=flow_identifier,
          client_identifier=client_identifier)
      self.state.StoreContainer(results_container)
Collaborator

What's the point in storing an empty dataframe here? Wouldn't it be better just to not store any container?

Collaborator Author

I have it as an empty dataframe as the corresponding container attribute is currently not optional. It also simplifies the logic in downstream processing of the container.

Collaborator

My question was more "why add a container at all" if the dataframe is going to be empty anyway.

Collaborator Author

@sydp, Dec 30, 2022

Oops, my bad, I missed the second question. My rationale for doing it this way is that no result (i.e. empty data) is still a result: it is useful feedback downstream, letting the module or user know that the query succeeded and returned nothing.

Collaborator

OK, that makes sense, thanks!
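
The rationale in this thread (an empty result is still a result) can be sketched as follows. This is an illustration under stated assumptions, not dfTimewolf code: `QueryResult` is a hypothetical stand-in for the `OsqueryResult` container, a plain list of rows stands in for the pandas DataFrame, and `summarise` is an invented downstream consumer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QueryResult:
    """Hypothetical stand-in for an osquery result container."""
    hostname: str
    query: str
    # Stand-in for the DataFrame: an empty list means "query ran, no rows".
    rows: List[dict] = field(default_factory=list)


def summarise(hostnames: List[str], results: List[QueryResult]) -> Dict[str, str]:
    """Labels each host as 'hits', 'no rows', or 'no result stored'."""
    by_host = {result.hostname: result for result in results}
    summary = {}
    for hostname in hostnames:
        result = by_host.get(hostname)
        if result is None:
            # Nothing was stored, so downstream cannot tell whether the query ran.
            summary[hostname] = "no result stored"
        elif result.rows:
            summary[hostname] = "hits"
        else:
            # A stored but empty result confirms the query ran and matched nothing.
            summary[hostname] = "no rows"
    return summary


query = "SELECT * FROM processes;"
stored = [QueryResult("host-a", query, [{"pid": 1}]), QueryResult("host-b", query)]
summary = summarise(["host-a", "host-b", "host-c"], stored)
print(summary)
```

Storing the empty container for `host-b` lets the consumer distinguish "ran and found nothing" from `host-c`, for which no container exists at all.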

@tomchop merged commit 0faedec into log2timeline:main Jan 4, 2023